The Good Robot podcast: what makes a drone "good"? with Beryl Pong
Hosted by Eleanor Drage and Kerry McInerney, The Good Robot is a podcast that explores the many complex intersections between gender, feminism and technology. What makes a drone "good"? In this episode, we talk to Beryl Pong, UKRI Future Leaders Fellow at the University of Cambridge, where she leads the Centre for Drones and Culture. Beryl reflects on what it means to think of drones as "good" or "ethical" technologies, and how this can be assessed through their socio-political context.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)
- North America > United States > Alaska (0.07)
- Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.68)
- Information Technology > Artificial Intelligence > Games > Go (0.63)
AI enables a Who's Who of brown bears in Alaska
Being able to distinguish individual animals - including their unique history, movement patterns and habits - can help scientists better understand how their species functions, and therefore better manage habitats and study population dynamics. Today, most computer vision systems for tracking animals are effective on species with distinctive patterns and markings, such as zebras, leopards and giraffes. The task is much more complicated for unmarked species, where individual differences are harder to spot. Distinguishing a particular brown bear from its peers in a non-invasive way requires an incredible eye for detail and years of viewing the same bears over time. What's more, these bears emerge from hibernation in the spring with shaggy fur, having lost quite a bit of weight; they then substantially increase their body weight feasting on salmon while fully shedding their winter coat - enough to throw off experts and AI algorithms alike.
- Research Report > New Finding (0.49)
- Research Report > Experimental Study (0.49)
Learning to see the physical world: an interview with Jiajun Wu
What is your research area? My research topic, at a high level, hasn't changed much since my dissertation. It has always been the problem of physical scene understanding - building machines that see, reason about, and interact with the physical world. Besides learning algorithms, what are the levels of abstraction needed by AI systems in their representations, and where do they come from? I aim to answer these fundamental questions, drawing inspiration from nature, i.e., the physical world itself, and from human cognition.
How can robots acquire skills through interactions with the physical world? An interview with Jiaheng Hu
One of the key challenges in building robots for household or industrial settings is the need to master the control of high-degree-of-freedom systems such as mobile manipulators. Reinforcement learning has been a promising avenue for acquiring robot control policies; however, scaling to complex systems has proved tricky. In their work SLAC: Simulation-Pretrained Latent Action Space for Whole-Body Real-World RL, Jiaheng Hu and colleagues introduce a method that renders real-world reinforcement learning feasible for complex embodiments. We caught up with Jiaheng to find out more.
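The core idea of a simulation-pretrained latent action space can be illustrated with a toy sketch: a decoder learned in simulation maps a low-dimensional latent action to coordinated whole-body joint commands, so the real-world RL policy only has to explore the small latent space. All names, dimensions, and the random "decoder" below are invented for illustration; they are not from the SLAC paper.

```python
import numpy as np

rng = np.random.default_rng(0)
N_JOINTS, LATENT_DIM = 12, 3  # hypothetical: high-DoF robot, small latent space

# Stand-in for a decoder pretrained in simulation: it maps a low-dimensional
# latent action to coordinated commands for all joints at once.
W_dec = rng.standard_normal((N_JOINTS, LATENT_DIM))

def decode_action(z):
    """Expand a latent action into whole-body joint commands in [-1, 1]."""
    return np.tanh(W_dec @ z)

# The real-world RL policy now explores the 3-D latent space rather than
# the full 12-D joint space, which is what makes real-world RL tractable.
z = rng.standard_normal(LATENT_DIM)
joint_cmd = decode_action(z)
print(joint_cmd.shape)  # (12,)
```

The design choice to hide here is that exploration noise in the latent space produces coordinated, physically plausible whole-body motion, rather than independent per-joint jitter.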
Yann LeCun's new venture is a contrarian bet against large language models
In an exclusive interview, the AI pioneer shares his plans for his new Paris-based company, AMI Labs. Yann LeCun is a Turing Award recipient and a top AI researcher, but he has long been a contrarian figure in the tech world. He believes that the industry's current obsession with large language models is wrong-headed and will ultimately fail to solve many pressing problems. Instead, he thinks we should be betting on world models--a different type of AI that accurately reflects the dynamics of the real world. He is also a staunch advocate for open-source AI and criticizes the closed approach of frontier labs like OpenAI and Anthropic. Perhaps it's no surprise, then, that he recently left Meta, where he had served as chief scientist for FAIR (Fundamental AI Research), the company's influential research lab that he founded. Meta has struggled to gain much traction with its open-source AI model Llama and has seen internal shake-ups, including the controversial acquisition of ScaleAI. LeCun sat down for an exclusive online interview from his Paris apartment to discuss his new venture, life after Meta, the future of artificial intelligence, and why he thinks the industry is chasing the wrong ideas.
- Asia > China (0.05)
- North America > United States > New York (0.05)
- North America > United States > California (0.05)
Google Gemini Is Taking Control of Humanoid Robots on Auto Factory Floors
Google DeepMind and Boston Dynamics are teaming up to integrate Gemini into a humanoid robot called Atlas. The collaboration will give Boston Dynamics' humanoid robots the intelligence required to navigate unfamiliar environments and to identify and manipulate objects--precisely the kinds of capabilities needed to perform manual labor. Announced at CES in Las Vegas, the partnership will see Google's Gemini Robotics model deployed on various Boston Dynamics robots, including the humanoid Atlas and a robot dog called Spot. The companies plan to test Gemini-powered Atlas robots at auto factories belonging to Hyundai, Boston Dynamics' parent company, in the coming months. The move is an early look at a future where humanoids are able to quickly master a wide range of tasks.
- North America > United States > Nevada > Clark County > Las Vegas (0.25)
- Asia > China (0.05)
- North America > United States > California > Los Angeles County > Los Angeles (0.05)
Prediction with Action: Visual Policy Learning via Joint Denoising Process
Diffusion models have demonstrated remarkable capabilities in image generation tasks, including image editing and video creation, reflecting a good understanding of the physical world. In another line of work, diffusion models have also shown promise in robotic control tasks by denoising actions, an approach known as diffusion policy. Although the diffusion generative model and the diffusion policy exhibit distinct capabilities--image prediction and robotic action, respectively--they technically follow a similar denoising process. In robotic tasks, the ability to predict future images and the ability to generate actions are highly correlated, since they share the same underlying dynamics of the physical world. Building on this insight, we introduce \textbf{PAD}, a novel visual policy learning framework that unifies image \textbf{P}rediction and robot \textbf{A}ction within a joint \textbf{D}enoising process. Specifically, PAD utilizes Diffusion Transformers (DiT) to seamlessly integrate images and robot states, enabling the simultaneous prediction of future images and robot actions.
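The "joint denoising" idea can be sketched in a few lines: image features and action features are concatenated into one sequence, a single denoiser predicts noise for both at once, and one reverse-diffusion step updates both jointly. This is a minimal toy, not the PAD implementation; the linear `denoiser`, the dimensions, and the update rule are all invented stand-ins (the paper uses a Diffusion Transformer).

```python
import numpy as np

rng = np.random.default_rng(0)
IMG_DIM, ACT_DIM = 8, 3  # toy sizes; the real method uses image patches and robot states

def denoiser(x_t, t, W):
    """Stand-in for the Diffusion Transformer: one map over the *concatenated*
    image/action vector, so both modalities are denoised by a single model."""
    return np.tanh(W @ x_t) * (1.0 - t)  # predicted noise, shrinking as t -> 0

def joint_denoise_step(img_t, act_t, t, W, step=0.1):
    x_t = np.concatenate([img_t, act_t])        # shared token sequence
    eps = denoiser(x_t, t, W)                   # single joint noise prediction
    x_prev = x_t - step * eps                   # simplified reverse-diffusion update
    return x_prev[:IMG_DIM], x_prev[IMG_DIM:]   # future-image part, action part

W = rng.standard_normal((IMG_DIM + ACT_DIM, IMG_DIM + ACT_DIM)) * 0.1
img_t, act_t = rng.standard_normal(IMG_DIM), rng.standard_normal(ACT_DIM)
img_prev, act_prev = joint_denoise_step(img_t, act_t, t=0.9, W=W)
print(img_prev.shape, act_prev.shape)  # (8,) (3,)
```

The point of the sketch is the coupling: because one model denoises the concatenated vector, the predicted action can depend on the predicted future image and vice versa, which is exactly the correlation the abstract appeals to.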
Language Models Meet World Models: Embodied Experiences Enhance Language Models
While large language models (LMs) have shown remarkable capabilities across numerous tasks, they often struggle with simple reasoning and planning in physical environments, such as understanding object permanence or planning household activities. This limitation arises from the fact that LMs are trained only on written text and miss essential embodied knowledge and skills. In this paper, we propose a new paradigm of enhancing LMs by finetuning them with world models, so that they gain diverse embodied knowledge while retaining their general language capabilities. Our approach deploys an embodied agent in a world model, specifically a simulator of the physical world (VirtualHome), and acquires a diverse set of embodied experiences through both goal-oriented planning and random exploration. These experiences are then used to finetune LMs, teaching diverse abilities for reasoning and acting in the physical world, e.g., planning and completing goals, object permanence and tracking, etc. Moreover, it is desirable to preserve the generality of LMs during finetuning, which facilitates generalizing the embodied knowledge across tasks rather than tying it to specific simulations. We thus further introduce classical elastic weight consolidation (EWC) for selective weight updates, combined with low-rank adapters (LoRA) for training efficiency. Extensive experiments show our approach substantially improves base LMs on 18 downstream tasks by 64.28% on average. In particular, the small LMs (1.3B, 6B, and 13B) enhanced by our approach match or even outperform much larger LMs (e.g., ChatGPT).
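The EWC regularizer mentioned above has a compact closed form: it penalizes movement of each parameter away from its pretrained value, weighted by an estimate of that parameter's importance (the diagonal of the Fisher information). Below is a minimal sketch of that penalty; the numbers and the diagonal-Fisher estimate are illustrative, not taken from the paper.

```python
import numpy as np

def ewc_penalty(theta, theta_star, fisher_diag, lam=1.0):
    """Elastic weight consolidation penalty:
       (lam / 2) * sum_i F_i * (theta_i - theta*_i)^2

    theta       -- current parameters (during embodied finetuning)
    theta_star  -- parameters of the original pretrained language model
    fisher_diag -- diagonal Fisher information estimate, one entry per parameter
    """
    theta, theta_star, fisher_diag = map(np.asarray, (theta, theta_star, fisher_diag))
    return 0.5 * lam * float(np.sum(fisher_diag * (theta - theta_star) ** 2))

# Parameters the Fisher marks as important are pulled back toward their
# pretrained values; parameters with zero Fisher weight can move freely.
theta_star  = np.array([1.0, -2.0, 0.5])
theta       = np.array([1.5, -2.0, 2.5])
fisher_diag = np.array([4.0,  4.0, 0.0])   # third parameter is "free"
print(ewc_penalty(theta, theta_star, fisher_diag))  # 0.5 * (4*0.25 + 0 + 0) = 0.5
```

In training, this term is simply added to the task loss, so "selective weight updates" falls out of gradient descent: high-Fisher directions are effectively frozen, preserving general language ability while the low-Fisher directions absorb the embodied knowledge.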
Isometric 3D Adversarial Examples in the Physical World
Recently, several attempts have demonstrated that 3D deep learning models are as vulnerable to adversarial example attacks as 2D models. However, these methods are still far from stealthy and suffer from severe performance degradation in the physical world. Although 3D data is highly structured, it is difficult to bound the perturbations with simple metrics in Euclidean space. In this paper, we propose a novel $\epsilon$-isometric ($\epsilon$-ISO) attack method to generate natural and robust 3D adversarial examples in the physical world, by considering the geometric properties of 3D objects and the invariance to physical transformations. For naturalness, we constrain the adversarial example and the original one to be $\epsilon$-isometric by adopting the Gaussian curvature as a surrogate metric, under a theoretical analysis. For robustness under physical transformations, we propose a maxima over transformation (MaxOT) method that actively searches for the most difficult transformations rather than random ones, making the generated adversarial example more robust in the physical world. Extensive experiments on typical point cloud recognition models validate that our approach improves the attack success rate and naturalness of the generated 3D adversarial examples compared with state-of-the-art attack methods.
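The MaxOT idea contrasts with the common expectation-over-transformation approach: instead of averaging the attack loss over random physical transformations, optimize against the worst-case transformation found. The sketch below makes that concrete with an invented toy "classifier loss" over z-axis rotations of a point cloud; the real method searches transformations with optimization, not a grid, and attacks an actual recognition model.

```python
import numpy as np

def classifier_loss(points, angle):
    """Hypothetical surrogate: loss of a point-cloud classifier after rotating
    `points` about the z-axis by `angle` (a stand-in for a real model)."""
    c, s = np.cos(angle), np.sin(angle)
    R = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    rotated = points @ R.T
    return float(np.mean(rotated[:, 0] ** 2))  # toy objective

def maxot_objective(points, candidate_angles):
    """Maxima-over-transformation: score the attack by its *worst-case*
    (here: most lossy) transformation rather than an average over random ones."""
    losses = [classifier_loss(points, a) for a in candidate_angles]
    worst = int(np.argmax(losses))
    return candidate_angles[worst], losses[worst]

pts = np.array([[1.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
angles = np.linspace(0.0, np.pi, 16)
worst_angle, worst_loss = maxot_objective(pts, angles)
print(worst_angle, worst_loss)
```

By construction the worst-case loss is at least the average-case loss, which is why hardening an adversarial example against the MaxOT objective yields robustness under the full range of physical transformations, not just typical ones.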
The 5 coolest gadget innovations of 2025
Deep down, we want to be cyborgs. We spend huge chunks of time interacting with technology every day, but the friction created by devices and interfaces persists. This year, we got closer than ever to tech that truly augments reality. Meta took its smart glasses beyond their beginnings as a simple content-creation tool.